Easy2Siksha.com
Suppose you want to study the effect of gender on wages. Gender is not a number; it is a
category (male/female). To include it in a regression, you create a dummy variable:
• Male = 1, Female = 0.
The coefficient on this dummy then measures how average wages differ between men and women,
holding the other variables in the model constant.
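The wage example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the wage figures are made up, and simple_ols is a hypothetical helper written here, not a library function. It fits wage = a + b × Male by ordinary least squares.

```python
# Hypothetical data: hourly wages and a gender dummy (Male = 1, Female = 0).
wages = [20.0, 22.0, 24.0, 30.0, 32.0, 34.0]
male = [0, 0, 0, 1, 1, 1]


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = simple_ols(male, wages)
print("Average wage when Male = 0:", intercept)
print("Wage difference for Male = 1:", slope)
```

With a single dummy as the only regressor, the intercept equals the average wage of the group coded 0 (22 in this made-up data) and the slope equals the gap between the two group averages (10 here).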
Why Do We Need Dummy Variables?
Regression models require numerical inputs. But real-world data often includes qualitative
factors like gender, region, religion, or occupation. Dummy variables allow us to:
• Convert categories into numbers.
• Capture differences between groups.
• Add flexibility to regression models.
Without dummy variables, a regression would leave out these qualitative influences, and their
effects could be wrongly attributed to the variables that are included.
Uses of Dummy Variables
1. Representing Categories
Dummy variables represent qualitative categories in regression.
• Example: Urban vs. Rural (Urban = 1, Rural = 0).
• This helps measure how living in a city affects income compared to living in a village.
2. Comparing Groups
They allow comparison between groups.
• Example: In education studies, a dummy variable can represent whether a student
attended private school (1) or government school (0).
• The coefficient shows the difference in average performance of private school students
relative to government school students.
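Why the coefficient can be read as a group difference follows from a short derivation. The notation below is assumed for illustration and does not come from the original notes: Score is the outcome, Private is the dummy, u is the error term, and the error is assumed to have zero mean within each group.

```latex
\begin{align*}
\text{Score}_i &= \beta_0 + \beta_1\,\text{Private}_i + u_i,
  \qquad E[u_i \mid \text{Private}_i] = 0 \\
E[\text{Score}_i \mid \text{Private}_i = 0] &= \beta_0 \\
E[\text{Score}_i \mid \text{Private}_i = 1] &= \beta_0 + \beta_1 \\
E[\text{Score}_i \mid \text{Private}_i = 1] - E[\text{Score}_i \mid \text{Private}_i = 0] &= \beta_1
\end{align*}
```

So the intercept is the average score of the group coded 0, and the dummy's coefficient is the gap between the two group averages.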
3. Capturing Seasonal Effects
In time series data, dummy variables can represent seasons or months.
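• Example: For quarterly sales data, you can create dummies for Q2, Q3, and Q4, leaving Q1 as
the base quarter. Including all four together with an intercept causes perfect
multicollinearity (the dummy variable trap).
• Each coefficient then captures how sales in that quarter differ from the base quarter.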
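A minimal sketch of how quarterly dummies might be built is given below. The sales figures are invented and make_dummies is a hypothetical helper. It treats Q1 as the base quarter and also shows why a full set of four dummies is a problem when the model has an intercept: the four columns add up to 1 in every row, which duplicates the intercept column.

```python
from statistics import mean

# Hypothetical quarterly sales over two years.
quarters = ["Q1", "Q2", "Q3", "Q4", "Q1", "Q2", "Q3", "Q4"]
sales = [100.0, 120.0, 90.0, 150.0, 110.0, 130.0, 100.0, 160.0]


def make_dummies(labels, base=None):
    """Return one 0/1 column per category, skipping the base category if given."""
    levels = sorted(set(labels))
    return {
        level: [1 if label == level else 0 for label in labels]
        for level in levels
        if level != base
    }


# Three dummies (Q2, Q3, Q4); Q1 is the omitted base quarter.
dummies = make_dummies(quarters, base="Q1")

# With all four dummies, every row sums to 1, the same as the intercept column.
all_four = make_dummies(quarters)
row_sums = [sum(col[i] for col in all_four.values()) for i in range(len(quarters))]

# With only an intercept and the three dummies as regressors, the least-squares
# estimates equal the base-quarter mean and the differences from that mean.
group_mean = {
    q: mean(s for label, s in zip(quarters, sales) if label == q)
    for q in sorted(set(quarters))
}
intercept = group_mean["Q1"]
effects = {q: group_mean[q] - intercept for q in dummies}

print("Base (Q1) average:", intercept)
print("Differences from Q1:", effects)
```

Choosing a different base quarter changes the individual coefficients but not the fitted values; the choice only decides which quarter the others are compared against.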
4. Policy Impact Analysis
Dummy variables can represent whether a policy was implemented.
• Example: Before policy = 0, After policy = 1.
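• The coefficient shows the average change in the outcome after the policy. It reflects the
policy's impact only if other factors that changed over the same period are controlled for.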
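The before/after coding can be sketched as follows. Everything here is assumed for illustration: the yearly figures, the POLICY_YEAR cutoff, and the policy_dummy helper are hypothetical. The sketch builds the dummy from the year and computes the simple before/after gap, which is what the dummy's coefficient would equal if it were the only regressor.

```python
from statistics import mean

# Hypothetical yearly observations: (year, outcome).
observations = [
    (2015, 50.0), (2016, 52.0), (2017, 51.0),
    (2018, 58.0), (2019, 60.0), (2020, 59.0),
]
POLICY_YEAR = 2018  # assumed first year in which the policy is in force


def policy_dummy(year, start=POLICY_YEAR):
    """Return 0 for years before the policy and 1 from the start year onward."""
    return 1 if year >= start else 0


dummy = [policy_dummy(year) for year, _ in observations]

# Average outcome in each period, split by the dummy.
before = mean(value for (_, value), d in zip(observations, dummy) if d == 0)
after = mean(value for (_, value), d in zip(observations, dummy) if d == 1)
change = after - before

print("Policy dummy:", dummy)
print("Average before:", before)
print("Average after:", after)
print("Before/after change:", change)
```

This raw gap would also pick up any trend or other event that coincides with the policy, so in practice additional regressors are usually included alongside the dummy.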